fix(turso): replace client.transaction() with UPDATE...RETURNING to prevent native connection leak in queue poller#41
Open
CalmProton wants to merge 1 commit into mizzle-dev:main from
…revent connection leak in queue poller

@libsql/client's transaction() method sets this.#db = null after BEGIN so that concurrent execute() calls use a separate connection. After commit/rollback, the detached Database object is abandoned to GC. At the default 100ms poll interval this orphans ~10 native SQLite connections per second, more than GC can reclaim, exhausting OS file-handle or native-heap limits and crashing the host process (exit code 5) after 5-10 minutes of idle operation.

Replacing the three-step SELECT + UPDATE + COMMIT with a single atomic UPDATE...RETURNING via client.execute() preserves atomicity through SQLite's implicit per-statement transaction while reusing the single cached connection (execute() calls #getDb() without nulling #db).

Fixes: https://github.com/mizzle-dev/workflow-worlds/issues/TBD
a3b0c3d to 987bb4f
Problem

When world.start() is called, @workflow-worlds/turso starts a queue poller that fires every 100 ms. On each tick, pollAndProcess() calls client.transaction('write') to atomically claim a message.

This causes a native SQLite connection leak that crashes the host process after 5–10 minutes of idle operation.
Root cause: @libsql/client's transaction() orphans the connection

The relevant code is in @libsql/client@0.17.2 (sqlite3.js:154-158).

After transaction() returns, this.#db is null. The Database object is handed to Sqlite3Transaction. Once the transaction commits or rolls back and tx goes out of scope, db is only reachable by GC. Meanwhile, the next #getDb() call (on the next poll tick) opens a brand new native SQLite connection.

Result: ~10 orphaned Database connections per second at the default 100 ms poll interval. These accumulate faster than GC can reclaim them, exhausting OS file-handle or native-heap limits. The process crashes, typically with exit code 5 (V8 fatal error / resource exhaustion), after 5–10 minutes.

The leak occurs even when the queue is empty (zero pending messages), because the poll loop still opens a write transaction on every tick to check for work.
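
The caching behaviour described above can be modelled without the native library. The sketch below is only an illustration: FakeDatabase, ModelClient and countConnections are hypothetical names, not part of @libsql/client, and the counter stands in for native SQLite handles. It assumes nothing beyond what this description states: execute() reuses the cached connection, while transaction() detaches it.

```typescript
// Hypothetical model of the behaviour described above. FakeDatabase and
// ModelClient are illustrative names, not part of @libsql/client.
let openedConnections = 0;

class FakeDatabase {
  constructor() {
    openedConnections += 1; // stands in for opening a native SQLite handle
  }
}

class ModelClient {
  private db: FakeDatabase | null = null;

  // Models the described #getDb(): lazily open and cache one connection.
  private getDb(): FakeDatabase {
    if (this.db === null) {
      this.db = new FakeDatabase();
    }
    return this.db;
  }

  // Models the described execute(): reuses the cached connection.
  execute(): void {
    this.getDb();
  }

  // Models the described transaction(): detaches the cached connection,
  // so the next getDb() call has to open a new one.
  transaction(): FakeDatabase {
    const db = this.getDb();
    this.db = null;
    return db;
  }
}

// Runs `ticks` poll iterations and reports how many connections were opened.
function countConnections(ticks: number, useTransaction: boolean): number {
  openedConnections = 0;
  const client = new ModelClient();
  for (let i = 0; i < ticks; i++) {
    if (useTransaction) {
      client.transaction();
    } else {
      client.execute();
    }
  }
  return openedConnections;
}

// 10 ticks correspond to one second at the default 100 ms interval.
console.log(countConnections(10, true)); // 10
console.log(countConnections(10, false)); // 1
```

In this model the transaction path opens one connection per tick, matching the ~10 per second figure, while the execute path opens a single connection for the lifetime of the client.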
Note: client.batch() does not have this behaviour; it uses #getDb() and never nulls #db.

How to observe it
Set WORKFLOW_DEBUG=true before world.start(). You will see a flood of [workflow:debug] Poll error: ... lines (one every 100 ms) if the schema has not been applied, or a silent connection drain if it has.

Fix
Replace the three-step transaction → SELECT → UPDATE → commit/rollback with a single atomic UPDATE … RETURNING via client.execute().

client.execute() calls #getDb() without setting #db = null, so the single cached connection is reused on every poll tick: no orphaned connection, no leak.

Atomicity is preserved: SQLite wraps every top-level DML statement in an implicit transaction, so the subquery SELECT and the outer UPDATE are indivisible. Two concurrent pollers cannot claim the same message.
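
A minimal sketch of what a single-statement claim might look like follows. The table and column names (queue_messages, id, status, payload, created_at), the status values, and the helper claimNextMessage are assumptions made for illustration; the actual schema and code in this PR may differ. ExecuteClient is a hand-written structural type covering only the one call used here, assumed to be compatible with the execute({ sql, args }) form of @libsql/client.

```typescript
// Minimal structural type for the single call used below (an assumption,
// written by hand so the sketch has no external dependencies).
interface ExecuteClient {
  execute(stmt: { sql: string; args: unknown[] }): Promise<{
    rows: Array<Record<string, unknown>>;
  }>;
}

// Hypothetical schema: queue_messages(id, status, payload, created_at).
// The subquery picks the oldest pending row and the outer UPDATE claims it.
// Both run inside SQLite's implicit per-statement transaction.
const CLAIM_SQL = `
  UPDATE queue_messages
  SET status = ?
  WHERE id = (
    SELECT id FROM queue_messages
    WHERE status = ?
    ORDER BY created_at
    LIMIT 1
  )
  RETURNING id, payload
`;

// Claims at most one message. Resolves to the claimed row, or null when
// nothing is pending (the UPDATE matched no row, so RETURNING is empty).
async function claimNextMessage(
  client: ExecuteClient,
): Promise<Record<string, unknown> | null> {
  const result = await client.execute({
    sql: CLAIM_SQL,
    args: ["processing", "pending"],
  });
  return result.rows[0] ?? null;
}
```

An empty queue costs one execute() call on the cached connection and returns null, so an idle poller no longer opens a write transaction on every tick.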
Testing

TypeScript type-check passes (pnpm --filter @workflow-worlds/turso typecheck). Existing test suite applies.

Impact

execute() instead of transaction() + two execute() calls + commit())